【week15】820. Short Encoding of Words 解题报告

820. Short Encoding of Words

题目

Given a list of words, we may encode it by writing a reference string S and a list of indexes A.

For example, if the list of words is ["time", "me", "bell"], we can write it as S = "time#bell#" and indexes = [0, 2, 5].

Then for each index, we will recover the word by reading from the reference string from that index until we reach a "#" character.

What is the length of the shortest reference string S possible that encodes the given words?

Example:

1
2
3
Input: words = ["time", "me", "bell"]
Output: 10
Explanation: S = "time#bell#" and indexes = [0, 2, 5].

Note:

  1. 1 <= words.length <= 2000.
  2. 1 <= words[i].length <= 7.
  3. Each word has only lowercase letters.

解题报告

题意不太简单,绕了一下。

很容易想到,如果一个单词 $words[i]$ 可以被 $words[j]$ 代替,那么它必然是 $words[j]$ 的子串。而且是从尾部开始的子串。

问题是怎么检测出两个单词间的关系,暴力一点可以直接两层循环 $O(N^2)$ ,但是这样太慢了,需要找一种方法快速剔除子串。这样我想到了哈希,首先可以把所有单词放进一个 unordered_set 里,然后把剔除每个 $word$ 的 每个子串,这看起来很麻烦,但其实并不会花费很多时间,因为 unorder_set 的操作是 $O(1)$ 的,而每个单词的长度不超过 7,几乎可以忽略不计,所以基本上整个剔除操作几乎是 $O(N)$ 的。而后面的计算答案就不用说了,时间代价也是 $O(N)$,所以总的时间代价是 $O(N)$ 的。

当然,这么做也不是最好的,还有一种方法是建立 Trie 树,但是这个我不会。。看评论区还有一种巧妙的算法是反转+排序:反转每个字符串,然后排序,这样只需检查 $sorted[i]$ 和 $sorted[i+1]$ 的关系即可,感觉也很独特。

Solution

1
2
3
4
5
6
7
8
9
10
11
12
13
class Solution {
public:
int minimumLengthEncoding(vector<string>& words) {
unordered_set<string> us(words.begin(), words.end());
for (auto w: words)
for (int i = 1; i < w.size(); ++i)
us.erase(w.substr(i));
int ret = 0;
for (auto s: us)
ret += s.size() + 1;
return ret;
}
};
CFLP 两种解题算法 【week14】397. Integer Replacement 解题报告

评论

Your browser is out-of-date!

Update your browser to view this website correctly. Update my browser now

×